Abstract: The performance of a novel fountain coding scheme
based on maximum distance separable (MDS) codes constructed
over Galois fields of order $q \geq 2$ is investigated. Upper and lower
bounds on the decoding failure probability under maximum 
likelihood decoding are developed. Differently from Raptor codes 
(which are based on a serial concatenation of a high-rate outer 
block code, and an inner Luby-transform code), the proposed 
coding scheme can be seen as a parallel concatenation of an 
outer MDS code and an inner random linear fountain code, both 
operating on the same Galois field. The gain provided by MDS-based
fountain coding over linear random fountain coding is assessed in
terms of decoding failure probability vs. overhead. It is shown how,
for example, the concatenation of a (15, 10) Reed-Solomon code and a
linear random fountain code over $\mathbb{F}_{16}$ yields a decoding failure
probability 4 orders of magnitude lower than the linear random
fountain code for the same overhead in a channel with a packet
loss probability of $\epsilon = 5 \cdot 10^{-2}$. Moreover, it is illustrated how
the performance of the concatenated fountain code approaches 
that of an idealized fountain code for higher-order Galois fields 
and moderate packet loss probabilities. The scheme introduced 
is of special interest for the distribution of data using small block 
sizes. 

I. Introduction

Fountain codes were introduced in [1] as an efficient alternative
to automatic retransmission query (ARQ) protocols in
multicast/broadcast transmission systems. Consider the case
where a sender (or source) needs to deliver a file to a set 
of Nu users. Consider furthermore the case where users are 
affected by packet losses. In this scenario, the usage of an 
ARQ protocol can result in large inefficiencies, since users
may lose different packets, and hence a large number of
retransmissions would crowd the downlink channel. Among
the efficient (coded) alternatives to ARQ protocols [2]-[5],
we shall focus on fountain codes only. When a fountain code 
is used, the source file is split in a set of k source packets. The 
sender, or fountain encoder, computes linear combinations of 
the k source packets and broadcasts them through the com- 
munication medium. After receiving k fountain coded packets, 
receivers can try to recover the source packets. If they fail to 
recover the source packets, they will try again to decode when they receive additional packets. The efficiency of a fountain code is measured by the number of packets (source + redundancy) that a user needs to collect to recover the source file. An idealized fountain code would allow the file recovery with a probability of success $P_s = 1$ from any set of $k$ received packets. Real fountain decoders need in general to receive a larger number of packets, $m = k + \delta$, to achieve a certain success probability. Commonly, $\delta$ is referred to as the overhead of the fountain code, and is used to measure its efficiency. More generally, a universal fountain code is a code which can recover the $k$ original source symbols out of $k + \delta$ symbols for any erasure channel and small $\delta$. The first class of universal fountain codes are Luby-transform (LT) codes [6]. One subclass of LT codes are random LT codes, or linear random fountain codes (LRFCs) [7]. When a binary LRFC is used [8], [9], the success probability can be accurately modeled as $P_s = 1 - 2^{-\delta}$ for $\delta \geq 2$ (it can be proved that $P_s$ is actually always lower bounded by $1 - 2^{-\delta}$ [9]). In [9] it was shown that this expression is still accurate for fountain codes based on sparse matrices (e.g., Raptor codes [7]). Moreover, in [9], the performance achievable by performing linear combinations of packets on Galois fields of order greater than 2 was analyzed. For an LRFC performing the linear combinations over $\mathbb{F}_q$, the decoding failure probability $P_F = 1 - P_s$ is bounded by [9]

$$q^{-\delta-1} \leq P_F(\delta, q) < \frac{1}{q-1} q^{-\delta}, \tag{1}$$

where both bounds are tight for increasing $q$. Furthermore, in [9] it was also shown that non-binary Raptor codes can in fact tightly approach the bounds (1) down to moderate error rates.
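
As a quick numerical illustration of how the two bounds from [9] tighten with the field order, the following sketch evaluates the lower bound $q^{-\delta-1}$ and the upper bound $q^{-\delta}/(q-1)$ for a few values of $q$ and $\delta$. The helper name `lrfc_bounds` and the chosen parameter values are our own assumptions for illustration and do not come from [9].

```python
def lrfc_bounds(delta, q):
    """Return (lower, upper) bounds on the LRFC decoding failure probability.

    Hypothetical helper: lower = q^(-delta-1), upper = q^(-delta) / (q - 1).
    """
    lower = q ** (-(delta + 1))
    upper = q ** (-delta) / (q - 1)
    return lower, upper


if __name__ == "__main__":
    # Illustrative field orders and overheads (assumed, not from the paper).
    for q in (2, 16, 256):
        for delta in (0, 2, 4):
            lo, up = lrfc_bounds(delta, q)
            print(f"q={q:3d} delta={delta}: {lo:.3e} <= P_F < {up:.3e} "
                  f"(upper/lower = {up / lo:.3f})")
```

The ratio between the upper and the lower bound is $q/(q-1)$ for every $\delta$, which is why the bounds become tight as $q$ grows.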

The result is remarkable, considering that for a Raptor code the encoding and decoding costs (defined as the number of arithmetic operations divided by the number of source symbols, $k$) are $O(\log(1/\alpha))$ and $O(k \log(1/\alpha))$ respectively, where $k(1+\alpha)$ is the number of output symbols needed to recover the source symbols with a high probability. For an LRFC the encoding cost is $O(k)$ and the decoding cost is $O(k^2)$, and thus it does not scale favorably with the input block size. However, if the block size is kept small, the decoding cost is still affordable.

The motivation of this paper is the analysis of a further improvement of the approach proposed in [9] for designing fountain codes with good performance for short block sizes. As in [9], non-binary fountain codes are considered in order to achieve this objective. Moreover, maximum distance separable (MDS) codes are introduced in parallel concatenation with the fountain encoder to enhance the performance of the scheme. By doing so, the first $n$ output symbols of the encoder are the $n$ output symbols of the MDS code.¹ In this paper, we illustrate how the performance of LRFCs in terms of probability of decoding failure can be further improved by such a concatenation. An analytical expression for the decoding failure probability vs. overhead will be derived under the assumption of maximum-likelihood (ML) decoding. We show how, when the packet loss rates are moderate to low, the probability of failure can be reduced by several orders of magnitude, approaching the performance of idealized fountain codes. The simulated performance of schemes based on Reed-Solomon (RS) codes is compared with the proposed expressions, confirming the accuracy of the proposed approach. The analysis is developed for the case of LRFCs. We conjecture that similar gains can be expected also in the case where (non-binary) LT codes are employed in the concatenation.

The paper is organized as follows. In Section II the proposed concatenated scheme is introduced. In Section III the performance analysis is provided, while numerical results are presented in Section IV. Conclusions follow in Section V.

II. Concatenation of Block Codes with Random 
Linear Fountain Codes 

Without loss of generality, we define the source block $\mathbf{u} = (u_1, u_2, \ldots, u_k)$ as a sequence of symbols belonging to a Galois field of order $q$, i.e. $\mathbf{u} \in \mathbb{F}_q^k$. In the proposed approach, the source block is first encoded via an $(n, k)$ systematic linear block code $\mathcal{C}$ over $\mathbb{F}_q$ with generator matrix $\mathbf{G}' = (\mathbf{I} | \mathbf{P}')$, where $\mathbf{I}$ is the $k \times k$ identity matrix and $\mathbf{P}'$ is a $k \times (n-k)$ matrix with elements in $\mathbb{F}_q$. The encoded block is hence given by $\mathbf{c}' = \mathbf{u}\mathbf{G}' = (c'_1, c'_2, \ldots, c'_n)$, where $c'_1 = u_1, c'_2 = u_2, \ldots, c'_k = u_k$ and the remaining $n-k$ symbols of $\mathbf{c}'$ are the redundancy symbols given by $(c'_{k+1}, c'_{k+2}, \ldots, c'_n) = \mathbf{u}\mathbf{P}'$. Additional redundancy symbols can be obtained by computing random linear combinations of the $k$ source symbols as

$$c_i = c''_{i-n} = \sum_{j=1}^{k} g_{j,i} u_j, \qquad i = n+1, \ldots, l,$$

where the coefficients $g_{j,i}$ are picked from $\mathbb{F}_q$ with a uniform probability $(1/q)$. The encoded sequence is hence given by $\mathbf{c} = (\mathbf{c}' | \mathbf{c}'')$. The overall generator matrix has the form

$$\mathbf{G} = \left( \begin{array}{cccc|cccc}
g_{1,1} & g_{1,2} & \cdots & g_{1,n} & g_{1,n+1} & g_{1,n+2} & \cdots & g_{1,l} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
g_{k,1} & g_{k,2} & \cdots & g_{k,n} & g_{k,n+1} & g_{k,n+2} & \cdots & g_{k,l}
\end{array} \right) = (\mathbf{G}' | \mathbf{G}''), \tag{2}$$

where $\mathbf{G}''$ is the generator matrix of the LRFC. (Note that, the LRFC being rateless, the number $l$ of columns of $\mathbf{G}$ can in principle grow indefinitely.) The encoder can hence be seen as a parallel concatenation of the linear block code $\mathcal{C}$ and of an LRFC (Fig. 1).

¹Note that for Raptor codes the output of the precode is further encoded by an LT code. Hence the first $n$ output symbols of the fountain encoder are not the output of the precode.

²We will assume an MDS linear block code constructed on the same field ($\mathbb{F}_q$) of the fountain code.

Fig. 1. Fountain coding scheme seen as a parallel concatenation of an $(n, k)$ linear block code and a linear random fountain code.
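
To make the structure of the encoder concrete, the following minimal sketch builds the output symbols for the binary case, assuming a $(k+1, k)$ single-parity-check code as the block code and symbols that are single bits. The function name `encode`, the use of Python's `random` module as the source of coefficients, and the list-of-bits representation are our own illustrative choices; a non-binary field would require proper $\mathbb{F}_q$ arithmetic in place of the modulo-2 sums.

```python
import random


def encode(u, num_symbols, rng):
    """Encode the binary source block u into num_symbols output symbols.

    The first n = k + 1 symbols form a systematic (k+1, k) SPC codeword;
    every further symbol is a uniformly random linear combination over F_2.
    Returns (symbols, columns), where columns[i] is the generator-matrix
    column (a list of k bits) that produced symbols[i].
    """
    k = len(u)
    columns = []
    # Systematic part of G': the k columns of the identity matrix.
    for j in range(k):
        columns.append([1 if i == j else 0 for i in range(k)])
    # Parity column of the SPC code: the all-ones column P'.
    columns.append([1] * k)
    # LRFC part G'': coefficients drawn uniformly from F_2.
    while len(columns) < num_symbols:
        columns.append([rng.randrange(2) for _ in range(k)])
    columns = columns[:num_symbols]
    # Each output symbol is the inner product of u with its column, mod 2.
    symbols = [sum(g * x for g, x in zip(col, u)) % 2 for col in columns]
    return symbols, columns


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only to make the run repeatable
    source = [1, 0, 1, 1, 0]
    coded, cols = encode(source, 12, rng)
    print("systematic + parity:", coded[:6])
    print("fountain symbols:   ", coded[6:])
```

Because the code is rateless, `num_symbols` can be chosen as large as needed; only the first $n$ columns are fixed by the block code.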



III. Performance Analysis 

Based on the bounds derived in [9], tight upper and lower 
bounds for the decoding failure probability of the fountain 
coding scheme can be derived in case of uncorrected erasures. 
The decoding failure probability ($P_F = \Pr\{F\}$, where $F$ denotes the decoding failure event) is defined as the probability that the source block $\mathbf{u}$ cannot be recovered out of a set of received symbols. In this paper we will focus on the case where the linear block code used in concatenation with the LRFC is maximum distance separable (MDS). When binary codes are used, we will assume $(k+1, k)$ single-parity-check (SPC) codes.³ When operating on higher order Galois fields, we will consider (shortened) RS codes.

The encoded sequence is given by $\mathbf{c} = \mathbf{u}\mathbf{G} = (c_1, c_2, \ldots, c_l)$, where the first $n$ symbols $(c_1, c_2, \ldots, c_n)$ represent a codeword of $\mathcal{C}$, and the remaining $l - n$ are produced by the LRFC. At the receiver side, a subset of $m$ symbols is received. We denote by $J = \{j_1, j_2, \ldots, j_m\}$ the set of the indexes of the symbols of $\mathbf{c}$ that have been received. The received vector $\mathbf{y}$ is hence given by

$$\mathbf{y} = (y_1, y_2, \ldots, y_m) = (c_{j_1}, c_{j_2}, \ldots, c_{j_m})$$

and it can be related to the source block $\mathbf{u}$ as $\mathbf{y} = \mathbf{u}\tilde{\mathbf{G}}$. Here, $\tilde{\mathbf{G}}$ denotes the $k \times m$ matrix made by the columns of $\mathbf{G}$ with indexes in $J$, i.e.

$$\tilde{\mathbf{G}} = \begin{pmatrix}
g_{1,j_1} & g_{1,j_2} & \cdots & g_{1,j_m} \\
\vdots & \vdots & \ddots & \vdots \\
g_{k,j_1} & g_{k,j_2} & \cdots & g_{k,j_m}
\end{pmatrix}.$$

The recovery of $\mathbf{u}$ reduces to solving the system of $m = k + \delta$ linear equations in $k$ unknowns

$$\tilde{\mathbf{G}}^T \mathbf{u}^T = \mathbf{y}^T, \tag{3}$$

e.g., via Gaussian elimination. The solution is possible if and only if $\operatorname{rank}(\tilde{\mathbf{G}}) = k$.

³Repetition codes are not considered here, since they would lead to a trivial fountain scheme where the source block is given by 1 symbol only.
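
The decoding step can be sketched as follows for the binary case. This is a minimal illustration of Gaussian elimination over $\mathbb{F}_2$ applied to the system in (3), assuming an erasure channel (so the received equations are always consistent) and one-bit symbols; the name `solve_gf2` and the list-based data layout are hypothetical choices of ours, not part of the scheme's specification.

```python
def solve_gf2(columns, y, k):
    """Recover u in F_2^k from received symbols y and their generator columns.

    columns[i] holds the k coefficients of the i-th received symbol, i.e. one
    row of the transposed matrix in (3). Returns u as a list of bits, or None
    when the rank is smaller than k (decoding failure).
    """
    # Augmented rows [coefficients | received symbol].
    rows = [list(col) + [sym] for col, sym in zip(columns, y)]
    pivot_row = 0
    for c in range(k):
        # Look for a row with a 1 in column c at or below pivot_row.
        sel = next((r for r in range(pivot_row, len(rows)) if rows[r][c]), None)
        if sel is None:
            return None  # no pivot for unknown c: rank < k
        rows[pivot_row], rows[sel] = rows[sel], rows[pivot_row]
        # Clear column c in every other row (addition in F_2 is XOR).
        for r in range(len(rows)):
            if r != pivot_row and rows[r][c]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # The first k rows are now the identity; the last entry of each is u_i.
    return [rows[i][k] for i in range(k)]


if __name__ == "__main__":
    u = [1, 0, 1]
    cols = [[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 0]]
    y = [sum(g * x for g, x in zip(col, u)) % 2 for col in cols]
    print(solve_gf2(cols, y, 3))
```

A receiver that obtains `None` simply waits for further symbols and calls the solver again with the enlarged system.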

Assuming $\mathcal{C}$ to be MDS, the system is solvable with probability 1 if, among the $m$ received symbols, at least $k$ have indexes in $\{1, 2, \ldots, n\}$, i.e. if $m' \geq k$ symbols produced by the linear block encoder have been received.

Let us consider the less trivial case where $m' < k$ among the $m$ received symbols have indexes in $\{1, 2, \ldots, n\}$. We can partition $\tilde{\mathbf{G}}^T$ as

$$\tilde{\mathbf{G}}^T = \begin{pmatrix} \tilde{\mathbf{G}}'^T \\ \tilde{\mathbf{G}}''^T \end{pmatrix}. \tag{4}$$

The MDS property of $\mathcal{C}$ assures that $\operatorname{rank}(\tilde{\mathbf{G}}') = m'$, i.e. the first $m'$ rows of $\tilde{\mathbf{G}}^T$ are linearly independent. Note that the $m'' \times k$ matrix $\tilde{\mathbf{G}}''^T$ (with $m'' = m - m'$) is a random matrix whose entries are picked with uniform probability in $\mathbb{F}_q$. It follows that the system defined by (4) can be put (via column permutations over $\tilde{\mathbf{G}}^T$ and row permutations/combinations over $\tilde{\mathbf{G}}'^T$) in the form

$$\begin{pmatrix} \mathbf{I} & \mathbf{A} \\ \mathbf{0} & \mathbf{B} \end{pmatrix} \tag{5}$$

where $\mathbf{I}$ is the $m' \times m'$ identity matrix, $\mathbf{0}$ is an $m'' \times m'$ all-0 matrix, and $\mathbf{A}$, $\mathbf{B}$ have respective sizes $m' \times (k - m')$ and $m'' \times (k - m')$. Note that the lower part of the matrix in (5), given by $(\mathbf{0}\ \mathbf{B})$, is obtained by adding to each row of $\tilde{\mathbf{G}}''^T$ a linear combination of rows from $\tilde{\mathbf{G}}'^T$, in a way that the $m'$ leftmost columns of $\tilde{\mathbf{G}}''^T$ are zeroed-out. It follows that the statistical properties of $\tilde{\mathbf{G}}''^T$ are inherited by the $m'' \times (k - m')$ sub-matrix $\mathbf{B}$, whose entries are hence picked with uniform probability in $\mathbb{F}_q$. The system is solvable if and only if $\mathbf{B}$ is full rank, i.e. if and only if $\operatorname{rank}(\mathbf{B}) = k - m'$.

Suppose now that the encoded symbols $\mathbf{c}$ are sent to a receiver over an erasure channel with erasure probability $\epsilon$. The probability that at least $k$ symbols out of the $n$ symbols produced by the linear block code encoder are received is given by

$$Q^*(\epsilon) = \sum_{i=k}^{n} \binom{n}{i} (1-\epsilon)^i \epsilon^{n-i}. \tag{6}$$

Hence, with a probability $P^*(\epsilon) = 1 - Q^*(\epsilon)$ the receiver would need to collect symbols encoded by the LRFC encoder to recover the source block. Assuming that the user collects $m = k + \delta$ symbols, out of which only $m' < k$ have been produced by the linear block encoder, the conditional decoding failure probability can be expressed as

$$\Pr(F | m', m' < k, \delta) = \Pr(\operatorname{rank}(\mathbf{B}) < k - m'). \tag{7}$$

Note that $\mathbf{B}$ is an $m'' \times (k - m') = (k + \delta - m') \times (k - m')$ random matrix, i.e. a random matrix with $\delta$ equations in excess w.r.t. the number of unknowns. We can thus replace (1) in (7), getting the bounds

$$q^{-\delta-1} \leq \Pr(F | m', m' < k, \delta) < \frac{1}{q-1} q^{-\delta}. \tag{8}$$

We remark that, since the bounds in (1) do not depend on the size of the random matrix (i.e., the bounds depend only on the overhead), we can remove the conditioning on $m'$ from (8), leaving

$$q^{-\delta-1} \leq \Pr(F | m' < k, \delta) < \frac{1}{q-1} q^{-\delta}.$$
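
The independence of the bounds from the matrix size can be checked numerically. The sketch below relies on the standard counting result that a uniformly random matrix over $\mathbb{F}_q$ with $t$ columns and $t + \delta$ rows has full column rank with probability $\prod_{i=\delta+1}^{\delta+t} (1 - q^{-i})$; this closed form is not stated in the text above and is used here only as an assumed reference value. The function name `exact_failure_prob` and the parameter choices are ours.

```python
def exact_failure_prob(unknowns, delta, q):
    """Probability that a uniformly random (unknowns + delta) x unknowns
    matrix over F_q is rank deficient (standard counting result, assumed)."""
    p_full_rank = 1.0
    for i in range(delta + 1, delta + unknowns + 1):
        p_full_rank *= 1.0 - q ** (-i)
    return 1.0 - p_full_rank


if __name__ == "__main__":
    q, delta = 16, 2  # illustrative values
    lower = q ** (-(delta + 1))
    upper = q ** (-delta) / (q - 1)
    # The number of unknowns plays the role of k - m' in the analysis.
    for unknowns in (1, 2, 5, 10):
        pf = exact_failure_prob(unknowns, delta, q)
        print(f"unknowns={unknowns:2d}: {lower:.6e} <= {pf:.6e} < {upper:.6e}")
```

Whatever the number of unknowns, the computed value stays between the same two bounds, which is what allows the conditioning on $m'$ to be dropped.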



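The final failure probability can be written as

$$P_F(\delta, \epsilon) = \Pr(F | m' < k, \delta) \Pr(m' < k) + \Pr(F | m' \geq k, \delta) \Pr(m' \geq k), \tag{9}$$

where $\Pr(F | m' \geq k, \delta) = 0$ and $\Pr(m' < k) = P^*(\epsilon)$. It results that

$$P^*(\epsilon)\, q^{-\delta-1} \leq P_F(\delta, \epsilon) < P^*(\epsilon) \frac{1}{q-1} q^{-\delta}. \tag{10}$$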

From an inspection of (1) and (10), one can note how the bounds on the failure probability of the concatenated scheme are scaled down by a factor $P^*(\epsilon)$, where $P^*(\epsilon) = \sum_{i=0}^{k-1} \binom{n}{i} (1-\epsilon)^i \epsilon^{n-i}$ is a monotonically increasing function of $\epsilon$. It follows that, when the channel conditions are bad (i.e., large $\epsilon$), $P^*(\epsilon) \rightarrow 1$, and the bounds in (10) tend to coincide with the bounds in (1). When the channel conditions are good (i.e., small $\epsilon$), most of the time $m' \geq k$ symbols produced by the linear block encoder are received, leading to a decoding success (recall the assumption of an MDS code). In these conditions, $P^*(\epsilon) \ll 1$, and according to the bounds in (10) the failure probability may scale down by several orders of magnitude.
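
The scaling factor and the resulting bounds are easy to evaluate numerically. The following sketch computes $P^*(\epsilon)$ and the two bounds on the failure probability of the concatenated scheme; the function names `p_star` and `concatenated_bounds` are hypothetical, and the parameter values in the usage part are simply the code parameters discussed in this paper.

```python
from math import comb


def p_star(n, k, eps):
    """Probability that fewer than k of the n block-code symbols are received."""
    return sum(comb(n, i) * (1 - eps) ** i * eps ** (n - i) for i in range(k))


def concatenated_bounds(n, k, q, delta, eps):
    """Lower and upper bounds on the failure probability of the concatenation,
    i.e. the LRFC bounds scaled by P*(eps)."""
    scale = p_star(n, k, eps)
    lower = scale * q ** (-(delta + 1))
    upper = scale * q ** (-delta) / (q - 1)
    return lower, upper


if __name__ == "__main__":
    # (15, 10) RS code over F_16 and (11, 10) SPC code over F_2.
    for n, k, q in ((15, 10, 16), (11, 10, 2)):
        for eps in (0.01, 0.05, 0.1, 0.3):
            scale = p_star(n, k, eps)
            lo, up = concatenated_bounds(n, k, q, delta=2, eps=eps)
            print(f"({n},{k}) q={q:2d} eps={eps:.2f}: "
                  f"P*={scale:.3e}, {lo:.3e} <= P_F < {up:.3e}")
```

Since $P^*(\epsilon)$ multiplies both bounds, its order of magnitude directly gives the gain of the concatenated scheme over the plain LRFC at the same overhead.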

Fig. 2 shows the probability of decoding failure as a function of the number of overhead symbols for a concatenated code built using an (11, 10) SPC code in $\mathbb{F}_2$. It can be observed how, for lower erasure probabilities, the performance gain in terms of probability of decoding failure increases. For $\epsilon = 0.01$ the decoding failure probability is more than 2 orders of magnitude lower than that of the LRFC. Fig. 3 shows the probability of decoding failure vs. the number of overhead symbols for the concatenation of a (15, 10) RS code and an LRFC over $\mathbb{F}_{16}$. The performance of the concatenated code is compared with that of the LRFC built on the same field for different erasure probabilities. In this case the decrease in terms of probability of decoding failure is larger than for the previously presented code in $\mathbb{F}_2$. For a channel with an erasure probability $\epsilon = 0.05$, the probability of decoding failure of the concatenated scheme is 4 orders of magnitude lower than for the LRFC.

The analysis provided in this section is also valid if the LRFC is substituted by an LT or Raptor code. In order to calculate the performance of such a concatenated code one has to substitute in (9) the term $\Pr(F | m' < k, \delta)$ by the probability of decoding failure of the LT or Raptor code. Again the failure probability of the concatenated scheme is scaled down by a factor $P^*(\epsilon)$, where $P^*(\epsilon) < 1$.

IV. Numerical Results 

Fig. 4 shows the results of simulations together with the bounds calculated using (10). In this case a (15, 10) RS code was concatenated with an LRFC in $\mathbb{F}_{16}$, and a channel with an erasure probability $\epsilon = 0.1$ was used. It can be seen how the simulation results match the analytical results down to a probability of decoding failure of 10^^. Fig. 5 shows the simulation results for a concatenated code using an (11, 10) parity check code in $\mathbb{F}_2$, and a channel with an erasure probability $\epsilon = 0.1$. It can be seen how the simulation results match the analytical results again. However, in $\mathbb{F}_2$ the bounds are less tight than in higher order Galois fields.

An assessment of the performance of the concatenated scheme
in a system with a large number of users has been performed,
assuming a system in which a transmitter sends a source block 
to a set of N receivers. We considered the erasure channels 
from the transmitter to the receivers to be independent, with an 
identical erasure probability e. Furthermore, we assumed that 
the receivers send an acknowledgement to the transmitter when 
they have successfully decoded the block. Ideal (error-free) 
feedback channels have been considered. When all receivers 
have sent an acknowledgement, the transmitter stops encoding 
redundant symbols for the source block. 

If $k + \Delta$ (where $\Delta$ denotes the transmitter overhead) symbols have been transmitted, the probability that a specific receiver gathers exactly $m$ symbols is

$$P_R\{k+\Delta, m\} = \binom{k+\Delta}{m} (1-\epsilon)^m \epsilon^{k+\Delta-m}. \tag{11}$$

The probability of decoding failure at the receiver given that the transmitter has sent $k + \Delta$ symbols is hence

$$P_e = \sum_{m=0}^{k-1} P_R\{k+\Delta, m\} + \sum_{m=k}^{k+\Delta} P_R\{k+\Delta, m\}\, P_F\{\delta = m-k, \epsilon\}.$$

The probability that at least one user has not decoded successfully is thus

$$P_E(N, \Delta, \epsilon) = 1 - (1 - P_e)^N. \tag{12}$$

Using the bounds in (10), $P_E(N, \Delta, \epsilon)$ can also be bounded. In the following we provide an example to assess the performance of the new scheme in comparison with LRFC codes and also with an idealized fountain code. We assume a system with $N = 10000$ users and a channel with an erasure probability $\epsilon = 0.01$. The performance of LRFC codes over $\mathbb{F}_2$ and $\mathbb{F}_{16}$ is shown as well as that of two concatenated schemes: a concatenation of an (11, 10) SPC code with an LRFC code in $\mathbb{F}_2$, and a concatenation of a (15, 10) RS code and an LRFC code over $\mathbb{F}_{16}$. It can be seen how the concatenated scheme in $\mathbb{F}_2$ outperforms the LRFC constructed on the same Galois field. For example, for $P_E$ = 10^^ the concatenated scheme in $\mathbb{F}_2$ needs only $\Delta = 20$ overhead symbols whereas the LRFC needs 27 (Fig. 6). In the case of the fountain codes operating in $\mathbb{F}_{16}$, the concatenated code shows a performance very close to that of an idealized fountain code.
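
The multi-user expressions can be evaluated with a few lines of code. The sketch below computes an upper bound on $P_e$ by inserting the upper bound on the single-user failure probability, $P^*(\epsilon)\, q^{-\delta}/(q-1)$ capped at 1, into the expression for $P_e$ as written above, and then evaluates $P_E$ as in (12). The function names and the use of `expm1`/`log1p` for numerical stability are our own choices and should be read as one possible way to carry out the computation.

```python
from math import comb, expm1, log1p


def p_star(n, k, eps):
    """Probability that fewer than k of the n block-code symbols are received."""
    return sum(comb(n, i) * (1 - eps) ** i * eps ** (n - i) for i in range(k))


def user_failure_upper(n, k, q, big_delta, eps):
    """Upper bound on P_e after k + big_delta transmitted symbols."""
    total = k + big_delta
    scale = p_star(n, k, eps)
    pe = 0.0
    for m in range(total + 1):
        # Probability of receiving exactly m symbols, as in (11).
        p_recv = comb(total, m) * (1 - eps) ** m * eps ** (total - m)
        if m < k:
            pe += p_recv  # fewer than k symbols: decoding always fails
        else:
            # Upper bound on P_F for receiver overhead delta = m - k.
            pf_upper = min(1.0, scale * q ** (-(m - k)) / (q - 1))
            pe += p_recv * pf_upper
    return min(pe, 1.0)


def system_failure_upper(num_users, n, k, q, big_delta, eps):
    """Upper bound on P_E in (12), evaluated in a numerically stable way."""
    pe = user_failure_upper(n, k, q, big_delta, eps)
    if pe >= 1.0:
        return 1.0
    # 1 - (1 - pe)^N computed as -expm1(N * log1p(-pe)).
    return -expm1(num_users * log1p(-pe))


if __name__ == "__main__":
    for big_delta in (5, 10, 20, 30):
        spc = system_failure_upper(10000, 11, 10, 2, big_delta, 0.01)
        rs = system_failure_upper(10000, 15, 10, 16, big_delta, 0.01)
        print(f"Delta={big_delta:2d}: SPC/F_2 bound={spc:.3e}, RS/F_16 bound={rs:.3e}")
```

Because $P_E$ is increasing in $P_e$, an upper bound on $P_e$ yields an upper bound on $P_E$; the corresponding lower bound follows in the same way from the lower bound in (10).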
Fig. 2. $P_F(\delta, \epsilon)$ vs. overhead for a concatenated code built using an (11, 10) SPC code over $\mathbb{F}_2$ for different values of $\epsilon$. Upper bounds are represented by solid lines and lower bounds are represented by dashed lines.

Fig. 3. $P_F(\delta, \epsilon)$ vs. overhead for a concatenated code built using a (15, 10) RS code over $\mathbb{F}_{16}$ for different values of $\epsilon$. Upper bounds are represented by solid lines and lower bounds are represented by dashed lines.

Fig. 4. $P_F(\delta, \epsilon)$ vs. overhead for the concatenation of a (15, 10) RS code and an LRFC over $\mathbb{F}_{16}$ and $\epsilon = 0.1$. Upper bounds are represented by solid lines and lower bounds are represented by dashed lines. The points marked with 'o' denote actual simulations.

Fig. 5. $P_F(\delta, \epsilon)$ vs. overhead symbols for the concatenation of an (11, 10) SPC code and an LRFC over $\mathbb{F}_2$ and $\epsilon = 0.1$. Upper bounds are represented by solid lines and lower bounds are represented by dashed lines. The points marked with 'o' denote actual simulations.

Fig. 6. $P_E$ vs. overhead at the transmitter in a system with $N = 10000$ users and $\epsilon = 0.01$. Results are shown for different fountain codes: LRFC in $\mathbb{F}_2$, LRFC in $\mathbb{F}_{16}$, concatenation of an (11, 10) SPC code with an LRFC code in $\mathbb{F}_2$, and a concatenation of a (15, 10) RS code and an LRFC code over $\mathbb{F}_{16}$.

V. Conclusions

A novel fountain coding scheme has been introduced. The scheme consists of a parallel concatenation of an MDS block code with an LRFC code, both constructed over the same field, $\mathbb{F}_q$. The performance of the concatenated fountain coding scheme has been analyzed through derivation of tight bounds on the probability of decoding failure as a function of the overhead. It has been shown how the concatenated scheme performs as well as LRFC codes in channels characterized by high erasure probabilities, whereas it provides failure probabilities lower by several orders of magnitude at moderate/low erasure probabilities.